Reusable Software Infrastructure for Stream Processing

Author

  • Robert Soulé
Abstract

Developers increasingly use streaming languages to write their data processing applications. While a variety of streaming languages exist, each targeting a particular application domain, they are all similar in that they represent a program as a graph of streams (i.e., sequences of data items) and operators (i.e., data transformers). They are also similar in that they must process large volumes of data with high throughput. To meet this requirement, compilers of streaming languages must provide a variety of streaming-specific optimizations, including automatic parallelization. Traditionally, when many languages share a set of optimizations, language implementors translate the source languages into a common representation called an intermediate language (IL). Because optimizations can modify the IL directly, they can be re-used by all of the source languages, reducing the overall engineering effort. However, traditional ILs and their associated optimizations target single-machine, single-process programs. In contrast, the kinds of optimizations that compilers must perform in the streaming domain are quite different, and often involve reasoning across multiple machines. Consequently, existing ILs are not suited to streaming languages.

This thesis addresses the problem of how to provide a reusable infrastructure for stream processing languages. Central to the approach is the design of an intermediate language specifically for streaming languages and optimizations. The hypothesis is that an intermediate language designed to meet the requirements of stream processing can assure implementation correctness; reduce overall implementation effort; and serve as a common substrate for critical optimizations. In evidence, this thesis provides the following contributions: (1) a catalog of common streaming optimizations that helps define the requirements of a streaming IL; (2) a calculus that enables reasoning about the correctness of source language translation and streaming optimizations; and (3) an intermediate language that preserves the semantics of the calculus, while addressing the implementation issues omitted from the calculus. This work significantly reduces the effort it takes to develop stream processing languages by making optimizations reusable across languages, and jump-starts innovation in language and optimization design.
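To make the idea of a shared representation concrete, the following is a minimal sketch in Python, not the thesis's actual IL or calculus. It models a linear graph of stateless operators and applies one optimization to that representation rather than to any source language. The names `MapOp`, `run_pipeline`, and `fuse` are hypothetical, and operator fusion is chosen only as an illustrative optimization; the abstract does not name it.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class MapOp:
    """A stateless operator: transforms each data item independently."""
    name: str
    fn: Callable[[int], int]


def run_pipeline(ops: List[MapOp], stream: Iterable[int]) -> Iterator[int]:
    """Push every item of the input stream through the operators in order."""
    for item in stream:
        for op in ops:
            item = op.fn(item)
        yield item


def fuse(ops: List[MapOp]) -> List[MapOp]:
    """Collapse a linear chain of stateless operators into one operator.

    Hypothetical IL-level optimization: because it rewrites the shared
    representation, any source language lowered to MapOp chains could reuse it.
    """
    if len(ops) < 2:
        return list(ops)
    chain = tuple(ops)  # snapshot so later changes to `ops` have no effect

    def fused(item: int) -> int:
        for op in chain:
            item = op.fn(item)
        return item

    return [MapOp("+".join(op.name for op in chain), fused)]


# Two operators, as a front end for some streaming language might emit them.
pipeline = [MapOp("scale", lambda x: x * 2), MapOp("offset", lambda x: x + 1)]
optimized = fuse(pipeline)

original_out = list(run_pipeline(pipeline, range(5)))
optimized_out = list(run_pipeline(optimized, range(5)))
```

The check that `original_out` equals `optimized_out` is a small stand-in for the kind of correctness argument the abstract attributes to the calculus: an optimization must not change the output a program produces.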

Similar resources

PuLSE-I: Deriving Instances from a Product Line Infrastructure

Reusing assets during application engineering promises to improve the efficiency of systems development. However, in order to benefit from reusable assets, application engineering processes must incorporate when and how to use the reusable assets during single system development. However, when and how to use a reusable asset depends on what types of reusable assets have been created. Product li...

Software Process Modeling using Functional Language Miranda

The aim of our research is to manipulate a software process on a computer system with the assistance of mathematical formality. To attain this aim, we define the cooperation which is an important concept in a software process using a functional language Miranda. In Miranda, we can represent naturally software process behavior and an interaction between humans in a software process using higher o...

Software Radar Signal Processing

Software infrastructure is a growing part of modern radio science systems. As part of developing a generic infrastructure for implementing Software Radar systems, we have developed a set of reusable signal processing components. These components are generic software based implementations for use on general purpose computing systems. The components allow for the implementation of signal processi...

Implementations of Signal Processing Kernels using Stream Virtual Machine for Raw Processor

Stream processing exploits the properties of the stream applications such as parallelism and regularity. DARPA’s Polymorphous Computing Architectures (PCA) program is developing both hardware and software that support stream (and thread) processing with a two-level compiler infrastructure. The Morphware Forum was formed to develop standard software interfaces to promote common interfaces and so...

Distributed data stream processing and edge computing: A survey on resource elasticity and future directions

Under several emerging application scenarios, such as in smart cities, operational monitoring of large infrastructure, wearable assistance, and Internet of Things, continuous data streams must be processed under very short delays. Several solutions, including multiple software engines, have been developed for processing unbounded data streams in a scalable and efficient manner. More recently, a...

Publication date: 2012